A Unified Model for Word Sense Representation and Disambiguation

نویسندگان

  • Xinxiong Chen
  • Zhiyuan Liu
  • Maosong Sun
چکیده

Most word representation methods assume that each word owns a single semantic vector. This is usually problematic because lexical ambiguity is ubiquitous, which is also the problem to be resolved by word sense disambiguation. In this paper, we present a unified model for joint word sense representation and disambiguation, which will assign distinct representations for each word sense.1 The basic idea is that both word sense representation (WSR) and word sense disambiguation (WSD) will benefit from each other: (1) highquality WSR will capture rich information about words and senses, which should be helpful for WSD, and (2) high-quality WSD will provide reliable disambiguated corpora for learning better sense representations. Experimental results show that, our model improves the performance of contextual word similarity compared to existing WSR methods, outperforms stateof-the-art supervised methods on domainspecific WSD, and achieves competitive performance on coarse-grained all-words WSD.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

رفع ابهام معنایی واژگان مبهم فارسی با مدل موضوعی LDA

Word sense disambiguation is the task of identifying the correct sense for the word in a given context among a finite set of possible sense. In this paper a model for farsi word sense disambiguation is presented. The model use two group of features: first, all word and stop words around target word and topic models as second features. We extract topics from a farsi corpus with Latent Dirichlet ...

متن کامل

Latent Semantic Word Sense Induction and Disambiguation

In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. Th...

متن کامل

A Unified Multilingual Semantic Representation of Concepts

Semantic representation lies at the core of several applications in Natural Language Processing. However, most existing semantic representation techniques cannot be used effectively for the representation of individual word senses. We put forward a novel multilingual concept representation, called MUFFIN, which not only enables accurate representation of word senses in different languages, but ...

متن کامل

Improving Summarization of Biomedical Documents Using Word Sense Disambiguation

We describe a concept-based summarization system for biomedical documents and show that its performance can be improved using Word Sense Disambiguation. The system represents the documents as graphs formed from concepts and relations from the UMLS. A degree-based clustering algorithm is applied to these graphs to discover different themes or topics within the document. To create the graphs, the...

متن کامل

An Implementation of Incremental Disambiguation with Conceptual Hierarchy

Word sense ambiguity is one of the most important problems of natural language semantic interpretation. In traditional method, there is a combinatorial explosion problem due to the fact that the propriety of each word sense of an ambiguous word is independently checked with each word sense of other ambiguous words during the disambiguation process. In this paper, we describe an `incremental sem...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014